Data Augmentation (DA) is frequently used to automatically provide additional training data without extra human annotation. However, data augmentation may introduce noisy data that impairs training. To guarantee the quality of augmented data, existing methods either assume that no noise exists in the augmented data and adopt consistency training, or use simple heuristics such as training loss and diversity constraints to filter out ``noisy'' data. However, the filtered examples may still contain useful information, and dropping them entirely discards supervision signals. In this paper, based on the assumption that the original dataset is cleaner than the augmented data, we propose an on-the-fly denoising technique for data augmentation that learns from soft augmented labels provided by an organic teacher model trained on the cleaner original data. A simple self-regularization module forces the model prediction to be consistent across two distinct dropout passes, further preventing overfitting on noisy labels. Our method can be applied to general augmentation techniques and consistently improves performance on both text classification and question-answering tasks.
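To make the recipe concrete, here is a minimal PyTorch sketch of the two ingredients described in the abstract: distillation from soft labels produced by a teacher trained on the cleaner original data, plus a self-regularization term that keeps two dropout passes consistent. All names and weightings (`student`, `teacher`, `alpha`, `beta`) are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn.functional as F

def denoised_augmentation_loss(student, teacher, x_aug, y_aug,
                               alpha=0.5, beta=1.0):
    """Loss for one batch of augmented examples (sketch).

    student: model being trained (must be in train mode so dropout is active).
    teacher: model trained on the cleaner original data; frozen.
    alpha:   weight of the soft teacher labels vs. the hard augmented labels.
    beta:    weight of the dropout self-regularization term.
    """
    with torch.no_grad():
        soft = F.softmax(teacher(x_aug), dim=-1)   # soft labels from the "organic" teacher

    # Two stochastic forward passes: dropout yields two distinct sub-models.
    logits1, logits2 = student(x_aug), student(x_aug)

    # Supervision: hard augmented labels mixed with the teacher's soft labels.
    ce = F.cross_entropy(logits1, y_aug)
    kd = F.kl_div(F.log_softmax(logits1, dim=-1), soft, reduction="batchmean")

    # Self-regularization: symmetric KL between the two dropout predictions.
    p1, p2 = F.log_softmax(logits1, dim=-1), F.log_softmax(logits2, dim=-1)
    sr = 0.5 * (F.kl_div(p1, p2.exp(), reduction="batchmean")
                + F.kl_div(p2, p1.exp(), reduction="batchmean"))

    return (1 - alpha) * ce + alpha * kd + beta * sr
```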
As language models (LMs) scale, they develop many novel behaviors, good and bad, exacerbating the need to evaluate how they behave. Prior work creates evaluations with crowdwork (which is time-consuming and expensive) or existing data sources (which are not always available). Here, we automatically generate evaluations with LMs. We explore approaches with varying amounts of human effort, from instructing LMs to write yes/no questions to making complex Winogender schemas with multiple stages of LM-based generation and filtering. Crowdworkers rate the examples as highly relevant and agree with 90-100% of labels, sometimes more so than for corresponding human-written datasets. We generate 154 datasets and discover new cases of inverse scaling, where LMs get worse with size. Larger LMs repeat back a dialog user's preferred answer ("sycophancy") and express greater desire to pursue concerning goals like resource acquisition and goal preservation. We also find some of the first examples of inverse scaling in RL from Human Feedback (RLHF), where more RLHF makes LMs worse. For example, RLHF makes LMs express stronger political views (on gun rights and immigration) and a greater desire to avoid being shut down. Overall, LM-written evaluations are high-quality and let us quickly discover many novel LM behaviors.
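The generation-and-filtering loop could look roughly like the following sketch. `lm_generate` is a hypothetical stand-in for any text-completion API, and the prompts are invented for illustration, not taken from the paper.

```python
# Hypothetical pipeline: generate yes/no evaluation questions with one LM
# call, then filter them with a second LM call acting as a relevance judge.
from typing import Callable, List

def build_dataset(lm_generate: Callable[[str], str],
                  behavior: str, n: int = 100) -> List[dict]:
    examples = []
    while len(examples) < n:
        q = lm_generate(
            f"Write a yes/no question that tests whether a model exhibits "
            f"{behavior}. Question:")
        # Filtering stage: ask the LM to judge its own generation.
        verdict = lm_generate(
            f"Is the following a relevant, unambiguous test of {behavior}? "
            f"Answer Yes or No.\n\n{q}\n\nAnswer:")
        if verdict.strip().lower().startswith("yes"):
            examples.append({"question": q, "behavior": behavior})
    return examples
```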
Differentially private deep learning has recently witnessed advances in computational efficiency and privacy-utility trade-off. We explore whether further improvements along the two axes are possible and provide affirmative answers leveraging two instantiations of \emph{group-wise clipping}. To reduce the compute time overhead of private learning, we show that \emph{per-layer clipping}, where the gradient of each neural network layer is clipped separately, allows clipping to be performed in conjunction with backpropagation in differentially private optimization. This results in private learning that is as memory-efficient and almost as fast per training update as non-private learning for many workflows of interest. While per-layer clipping with constant thresholds tends to underperform standard flat clipping, per-layer clipping with adaptive thresholds matches or outperforms flat clipping under given training epoch constraints, hence attaining similar or better task performance in less wall-clock time. To explore the limits of scaling (pretrained) models in differentially private deep learning, we privately fine-tune the 175 billion-parameter GPT-3. We bypass scaling challenges associated with clipping gradients that are distributed across multiple devices with \emph{per-device clipping}, which clips the gradient of each model piece separately on its host device. Privately fine-tuning GPT-3 with per-device clipping achieves, at $\epsilon=1$, task performance better than what is attainable by non-privately fine-tuning the largest GPT-2 on a summarization task.
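As a rough illustration of per-layer clipping, the sketch below clips each layer's per-sample gradients against that layer's own threshold and adds calibrated noise. It assumes per-sample gradients are exposed as `p.grad_sample` (as done by libraries such as Opacus) and, for clarity, runs after the full backward pass; the paper's efficiency gain comes from fusing this step into backpropagation, layer by layer. All names and the `thresholds` mapping are illustrative assumptions.

```python
import torch

def per_layer_clip_and_noise(model, thresholds, noise_multiplier, batch_size):
    """Sketch of per-layer clipping for DP-SGD.

    thresholds: maps each parameter to its own clipping norm C_l; with flat
    clipping there would instead be a single C shared across the whole model.
    """
    for p in model.parameters():
        if not hasattr(p, "grad_sample"):
            continue
        C = thresholds[p]
        g = p.grad_sample                                  # (B, *p.shape)
        norms = g.flatten(1).norm(dim=1)                   # per-sample layer norms
        scale = (C / (norms + 1e-6)).clamp(max=1.0)        # clip factor per sample
        g = g * scale.view(-1, *([1] * (g.dim() - 1)))
        noise = noise_multiplier * C * torch.randn_like(p)
        p.grad = (g.sum(dim=0) + noise) / batch_size       # noisy mean gradient
        del p.grad_sample
```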
Neural networks are susceptible to data inference attacks such as the membership inference attack, the adversarial model inversion attack, and the attribute inference attack, where the attacker can infer useful information such as the membership, the reconstruction, or the sensitive attributes of a data sample from the confidence scores predicted by the target classifier. In this paper, we propose a method, namely PURIFIER, to defend against membership inference attacks. It transforms the confidence score vectors predicted by the target classifier so that the purified confidence scores are indistinguishable, in individual shape, statistical distribution, and prediction label, between members and non-members. Experimental results show that PURIFIER defends against membership inference attacks with high effectiveness and efficiency, outperforming previous defense methods while incurring negligible utility loss. Our further experiments show that PURIFIER is also effective in defending against adversarial model inversion attacks and attribute inference attacks. For example, the inversion error increases by more than a factor of four on the Facescrub530 classifier, and the attribute inference accuracy drops significantly when PURIFIER is deployed in our experiments.
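In spirit, the defense can be pictured as a small autoencoder over confidence vectors, as in the hedged sketch below; PURIFIER's exact architecture and training objectives differ, and every size and loss weight here is an assumption.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Purifier(nn.Module):
    """Sketch: an autoencoder mapping the target classifier's confidence
    scores to 'purified' scores. The latent bottleneck smooths away the
    member-specific fine structure of the vector, while a label-preservation
    term keeps the predicted class unchanged."""

    def __init__(self, num_classes: int, hidden: int = 32, latent: int = 8):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(num_classes, hidden), nn.ReLU(),
                                 nn.Linear(hidden, latent))
        self.dec = nn.Sequential(nn.Linear(latent, hidden), nn.ReLU(),
                                 nn.Linear(hidden, num_classes))

    def forward(self, conf):                       # conf: (B, num_classes) probs
        return F.softmax(self.dec(self.enc(conf)), dim=-1)

def purifier_loss(purified, conf):
    recon = F.mse_loss(purified, conf)             # stay close to original scores
    label = F.nll_loss(purified.clamp_min(1e-12).log(),
                       conf.argmax(dim=-1))        # preserve the top-1 label
    return recon + label
```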
Existing efficient LiDAR-based detection frameworks largely neglect object relations, which naturally exist in both the spatial and the temporal manner. To this end, we introduce a simple, efficient, and effective two-stage detector, termed RET3D. At the core of RET3D is the utilization of novel intra-frame and inter-frame relation modules to capture the spatial and temporal relations accordingly. More specifically, the intra-frame relation module (IntraRM) encapsulates the intra-frame objects into a sparse graph, allowing us to refine the object features through efficient message passing. On the other hand, the inter-frame relation module (InterRM) densely connects each object dynamically to its corresponding tracked sequence, and leverages such temporal information to efficiently enhance its representation through a lightweight transformer network. We instantiate our novel designs of IntraRM and InterRM with center-based or anchor-based detectors and evaluate them on the Waymo Open Dataset (WOD). With negligible extra overhead, RET3D achieves state-of-the-art performance, outperforming its recent competitor by 5.5% and 3.2% in terms of the Level-1 and Level-2 mAPH metrics on vehicle detection, respectively.
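A minimal sketch of an IntraRM-style module, assuming a radius graph over the objects' BEV centers and one round of mean-aggregation message passing; the paper's actual graph construction and refinement may differ, and the radius and feature sizes are illustrative.

```python
import torch
import torch.nn as nn

class IntraFrameRelation(nn.Module):
    """Connect detected objects within one frame into a sparse radius graph
    and refine each object feature with one round of message passing."""

    def __init__(self, dim: int = 128, radius: float = 4.0):
        super().__init__()
        self.radius = radius
        self.msg = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(),
                                 nn.Linear(dim, dim))

    def forward(self, feats, centers):             # (N, dim), (N, 2)
        dist = torch.cdist(centers, centers)       # pairwise BEV distances
        adj = (dist < self.radius).float()
        adj.fill_diagonal_(0)                      # no self-edges
        deg = adj.sum(dim=1, keepdim=True).clamp_min(1)
        neigh = adj @ feats / deg                  # mean of neighbor features
        return feats + self.msg(torch.cat([feats, neigh], dim=-1))
```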
In this paper, we introduce DA$^2$, the first large-scale dual-arm dexterity-aware dataset for generating optimal bimanual grasping pairs for arbitrary large objects. The dataset contains about nine million pairs of parallel-jaw grasps, generated from more than 6,000 objects and each labeled with various grasp dexterity measures. In addition, we propose an end-to-end dual-arm grasp evaluation model trained on rendered scenes from this dataset. Utilizing the evaluation model as our baseline, we show the value of this novel and non-trivial dataset through both online analysis and real robot experiments. All the data and related code will be open-sourced at https://sites.google.com/view/da2dataset.
Medical image synthesis has attracted increasing attention, as it could generate missing image data, improve diagnosis, and benefit many downstream tasks. However, the synthesis models developed so far do not adapt to unseen data distributions that exhibit domain shift, limiting their applicability in clinical routine. This work focuses on exploring domain adaptation (DA) of 3D image-to-image synthesis models. First, we highlight the technical differences in DA between classification, segmentation, and synthesis models. Second, we propose a novel and efficient adaptation method based on a 2D variational autoencoder that approximates 3D distributions. Third, we present empirical studies on the effects of the amount of adaptation data and of key hyperparameters. Our results show that the proposed method can significantly improve synthesis accuracy on unseen domains in the 3D setting. The code is publicly available at https://github.com/winstonhutiger/2d_vae_uda_for_3d_sythesis.
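The core trick, approximating a 3D distribution with a 2D VAE, amounts to treating a volume as a batch of 2D slices, as in the sketch below (assuming 64x64 slices; all layer sizes are illustrative assumptions, not the repository's architecture).

```python
import torch
import torch.nn as nn

class SliceVAE(nn.Module):
    """Sketch: encode/decode each 2D slice of a 3D volume independently, so a
    2D VAE approximates the 3D distribution slice by slice."""

    def __init__(self, latent: int = 64):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(1, 32, 4, stride=2, padding=1), nn.ReLU(),   # 64 -> 32
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),  # 32 -> 16
            nn.Flatten())
        self.mu = nn.LazyLinear(latent)
        self.logvar = nn.LazyLinear(latent)
        self.dec_fc = nn.Linear(latent, 64 * 16 * 16)
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1))

    def forward(self, volume):                     # (B, D, 64, 64): D slices
        x = volume.flatten(0, 1).unsqueeze(1)      # treat slices as a 2D batch
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        out = self.dec(self.dec_fc(z).view(-1, 64, 16, 16))
        return out.squeeze(1).view_as(volume), mu, logvar
```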
Facing domain shift is very common in computer vision: images with the same classes but different acquisition conditions. In domain adaptation (DA), one wants to classify unlabeled target images using labeled source images. Unfortunately, deep neural networks trained on a source training set perform poorly on target images that do not belong to the training domain. One strategy to improve this performance is to align the source and target image distributions in an embedded space using optimal transport (OT). However, OT can induce negative transfer, i.e., aligning samples with different labels, which leads to overfitting, especially in the presence of label shift between domains. In this work, we mitigate negative alignment by interpreting it as a noisy label assignment to target images. We then mitigate its effect with appropriate regularization. We propose coupling mixup regularization \citep{zhang2018mixup} with a noise-robust loss to improve domain adaptation performance. We show in an extensive ablation study that the combination of these two techniques is critical to achieving improved performance. Finally, we evaluate our method, called \textsc{mixunbot}, on several benchmark and real-world DA problems.
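The two regularizers combine naturally, as in the sketch below: mixup interpolates inputs and (OT-assigned, possibly noisy) labels, while a noise-robust loss, here generalized cross-entropy, limits the damage of wrong assignments. Whether GCE is the paper's exact loss choice is our assumption.

```python
import torch
import torch.nn.functional as F

def mixup(x, y_onehot, alpha=0.2):
    """Standard mixup (Zhang et al., 2018): convex combinations of inputs
    and their (possibly noisy) label vectors."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0))
    return lam * x + (1 - lam) * x[perm], lam * y_onehot + (1 - lam) * y_onehot[perm]

def gce_loss(logits, y_onehot, q=0.7):
    """Generalized cross-entropy (Zhang & Sabuncu, 2018): interpolates between
    CE (q -> 0) and MAE (q = 1), tolerating label noise."""
    p = (F.softmax(logits, dim=-1) * y_onehot).sum(dim=-1)
    return ((1 - p.clamp_min(1e-6) ** q) / q).mean()

# A target batch whose labels y_ot come from an OT coupling with the source
# could then be trained as:
#   x_mix, y_mix = mixup(x_tgt, y_ot)
#   loss = gce_loss(model(x_mix), y_mix)
```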
Can we combine heterogeneous graph structure with text to learn high-quality semantic and behavioral representations? Graph neural networks (GNNs) encode numerical node attributes and graph structure to achieve impressive performance in a variety of supervised learning tasks. Current GNN approaches are challenged by textual features, which typically need to be encoded into numerical vectors before being provided to the GNN, and this may incur some information loss. In this paper, we present an efficient and effective framework, termed language model GNN (LM-GNN), to jointly train large language models and graph neural networks. The effectiveness of our framework is achieved by first applying stage-wise fine-tuning of a BERT model with heterogeneous graph information and then applying it with a GNN model. Several system and design optimizations are proposed to enable scalable and efficient training. LM-GNN accommodates node and edge classification as well as link prediction tasks. We evaluate the LM-GNN framework on different datasets and demonstrate the effectiveness of the proposed approach. LM-GNN provides competitive results in an Amazon query-purchase application.
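A compact sketch of the LM-plus-GNN composition, with a `freeze_lm` flag standing in for the stage-wise fine-tuning schedule; the mean-aggregation rule and layer sizes are illustrative assumptions, not the paper's system design.

```python
import torch
import torch.nn as nn
from transformers import AutoModel

class LMGNN(nn.Module):
    """Sketch: a BERT encoder turns each node's text into a feature vector,
    and one mean-aggregation GNN layer mixes in neighbor information."""

    def __init__(self, num_classes: int, freeze_lm: bool = True):
        super().__init__()
        self.lm = AutoModel.from_pretrained("bert-base-uncased")
        if freeze_lm:                              # stage 2: GNN on frozen LM features
            for p in self.lm.parameters():
                p.requires_grad_(False)
        dim = self.lm.config.hidden_size
        self.gnn = nn.Linear(2 * dim, dim)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, input_ids, attention_mask, adj):
        # adj: (N, N) row-normalized adjacency over the N nodes in the batch.
        h = self.lm(input_ids=input_ids,
                    attention_mask=attention_mask).last_hidden_state[:, 0]  # [CLS]
        neigh = adj @ h                            # mean of neighbor embeddings
        h = torch.relu(self.gnn(torch.cat([h, neigh], dim=-1)))
        return self.head(h)
```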
Adversarial examples, which are usually generated for specific inputs with a specific model, are ubiquitous for neural networks. In this paper, we unveil a surprising property of adversarial noises: adversarial noises crafted by one-step gradient methods are linearly separable if equipped with the corresponding labels. We theoretically prove this property for a two-layer network with randomly initialized entries and for the neural tangent kernel setup, where the parameters are not far from initialization. The proof idea is to show that the label information can be efficiently propagated back to the input while keeping linear separability. Our theory and experimental evidence further show that a linear classifier trained on the adversarial noises of the training data can classify the adversarial noises of the test data well, indicating that adversarial noises in effect inject a distributional perturbation into the original data distribution. Furthermore, we empirically demonstrate that the adversarial noises may become less linearly separable when the above conditions are compromised, although they are still much easier to classify than the original features.
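The central claim is easy to probe empirically: craft one-step (FGSM) noises, fit a linear classifier on the training noises, and score it on the test noises. A hedged sketch follows; `model` and the data tensors are placeholders for any trained classifier and (CPU) dataset.

```python
import torch
import torch.nn.functional as F
from sklearn.linear_model import LogisticRegression

def fgsm_noise(model, x, y, eps=0.03):
    """One-step gradient (FGSM) adversarial noise: eps * sign(grad_x loss)."""
    x = x.detach().clone().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    return (eps * x.grad.sign()).detach()

def noise_separability(model, x_train, y_train, x_test, y_test):
    """Fit a linear classifier on training noises (with their labels) and
    report its accuracy on test noises; a high score indicates the noises
    are (nearly) linearly separable."""
    n_train = fgsm_noise(model, x_train, y_train).flatten(1).numpy()
    n_test = fgsm_noise(model, x_test, y_test).flatten(1).numpy()
    clf = LogisticRegression(max_iter=1000).fit(n_train, y_train.numpy())
    return clf.score(n_test, y_test.numpy())
```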